Graph Connectivity Measures for Unsupervised Parameter Tuning of Graph-Based Sense Induction Systems.
Authors
Abstract
Word Sense Induction (WSI) is the task of identifying the different senses (uses) of a target word in a given text. This paper focuses on the unsupervised estimation of the free parameters of a graph-based WSI method, and explores the use of eight Graph Connectivity Measures (GCM) that assess the degree of connectivity in a graph. Given a target word and a set of parameters, the GCM evaluate the connectivity of the produced clusters, which correspond to subgraphs of the initial (unclustered) graph. Each parameter setting is assigned a score according to one of the GCM, and the highest-scoring setting is then selected. Our evaluation on the nouns of the SemEval-2007 WSI task (SWSI) shows that: (1) all GCM estimate a set of parameters that significantly outperforms the worst-performing parameter setting in both SWSI evaluation schemes, (2) all GCM estimate a set of parameters that outperforms the Most Frequent Sense (MFS) baseline by a statistically significant amount in the supervised evaluation scheme, and (3) two of the measures estimate a set of parameters that performs close to a set of parameters estimated in a supervised manner.
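The selection step described in the abstract (score each parameter setting by the connectivity of its clusters, then keep the highest-scoring setting) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the eight measures, so edge density is used here as a stand-in connectivity measure, averaging the score over clusters is an assumed aggregation, and the names `edge_density`, `score_setting`, `select_setting` and the toy graph are hypothetical. The clusterings are assumed to have been produced beforehand by the WSI method, one per parameter setting.

```python
def edge_density(nodes, edges):
    """Fraction of possible undirected edges that exist among `nodes`.

    Stand-in connectivity measure (an assumption, not necessarily one
    of the eight measures used in the paper).
    """
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Keep only edges whose endpoints both lie inside the cluster.
    inside = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    return len(inside) / (n * (n - 1) / 2)


def score_setting(clusters, edges):
    """Average connectivity over the clusters of one parameter setting.

    Averaging is an assumed way of combining per-cluster scores.
    """
    if not clusters:
        return 0.0
    return sum(edge_density(c, edges) for c in clusters) / len(clusters)


def select_setting(clusterings, edges):
    """Return the parameter setting whose clusters score highest."""
    return max(clusterings, key=lambda p: score_setting(clusterings[p], edges))


# Toy (hypothetical) graph and two candidate clusterings of it.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("c", "d")]
clusterings = {
    "setting_1": [{"a", "b", "c"}, {"d", "e"}],
    "setting_2": [{"a", "b", "c", "d", "e"}],
}
best = select_setting(clusterings, edges)
```

In this toy case the two tight clusters of `setting_1` are each fully connected, while the single cluster of `setting_2` contains only half of its possible edges, so `setting_1` is selected. No labelled data is consulted at any point, which is what makes the estimation unsupervised.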
Similar resources
Detecting Compositionality in Multi-Word Expressions
Identifying whether a multi-word expression (MWE) is compositional or not is important for numerous NLP applications. Sense induction can partition the context of MWEs into semantic uses and therefore aid in deciding compositionality. We propose an unsupervised system to explore this hypothesis on compound nominals, proper names and adjective-noun constructions, and evaluate the contribution of...
Graph Connectivity Measures for Unsupervised Word Sense Disambiguation
Word sense disambiguation (WSD) has been a long-standing research objective for natural language processing. In this paper we are concerned with developing graph-based unsupervised algorithms for alleviating the data requirements for large scale WSD. Under this framework, finding the right sense for a given word amounts to identifying the most “important” node among the set of graph nodes repre...
A Survey on Complexity of Integrity Parameter
Many graph-theoretical parameters have been used to describe the vulnerability of communication networks, including toughness, binding number, rate of disruption, neighbor-connectivity, integrity, mean integrity, edge-connectivity vector, l-connectivity and tenacity. In this paper we discuss Integrity and its properties in vulnerability calculation. The integrity of a graph G, I(G), is defined t...
Secure Communication in Shotgun Cellular Systems
In this paper, we analyze the secure connectivity in Shotgun cellular systems (SCS: Wireless communication systems with randomly placed base stations) by Poisson intrinsically secure communication graph (IS-graph), i.e., a random graph which describes the connections that are secure over a network. For a base-station in SCS, a degree of secure connections is determined over two channel models: ...
Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics
We introduce an approach to word sense induction and disambiguation. The method is unsupervised and knowledge-free: sense representations are learned from distributional evidence and subsequently used to disambiguate word instances in context. These sense representations are obtained by clustering dependency-based second-order similarity networks. We then add features for disambiguation from het...